Abstract

We consider the issue of assessing the influence of observations in the class of
Birnbaum-Saunders nonlinear regression models, which is useful in lifetime
data analysis. Our results generalize those in Galea et al. [2004, Influence
diagnostics in log-Birnbaum-Saunders regression models. Journal of Applied
Statistics 31, 1049-1064], which are confined to Birnbaum-Saunders linear
regression models. Some influence methods, such as local influence, total
local influence of an individual and generalized leverage, are discussed.
Additionally, the normal curvatures of local influence are derived under various
perturbation schemes.

Key words: Birnbaum-Saunders distribution; Fatigue life distribution; Influence
diagnostic; Generalized leverage; Lifetime data; Local influence; Maximum
likelihood estimation.

1 Introduction 



The family of distributions proposed by Birnbaum and Saunders (1969), also known 
as the fatigue life distribution, has been widely applied for describing fatigue life, 
and lifetimes in general. This family of distributions was originally obtained from 
a model for which failure follows from the development and growth of a dominant 
crack. It was later derived by Desmond (1985) using a biological model which 
followed from relaxing some of the assumptions originally made by Birnbaum and 
Saunders (1969). 

The random variable $T$ is said to have a Birnbaum-Saunders distribution, say
$\text{B-S}(\alpha, \eta)$, if its density function is given by

$$f(t; \alpha, \eta) = \frac{1}{2\alpha\eta\sqrt{2\pi}} \left[ \left(\frac{\eta}{t}\right)^{1/2} + \left(\frac{\eta}{t}\right)^{3/2} \right] \exp\left\{ -\frac{1}{2\alpha^2} \left( \frac{t}{\eta} + \frac{\eta}{t} - 2 \right) \right\}, \quad t > 0,$$

where $\alpha > 0$ and $\eta > 0$ are shape and scale parameters, respectively. The
density is right skewed, the skewness decreasing with $\alpha$. For any $k > 0$, it follows
that $kT \sim \text{B-S}(\alpha, k\eta)$. Some interesting results about improved statistical
inference for the $\text{B-S}(\alpha, \eta)$ distribution can be found in Lemonte et al. (2007, 2008).
Some generalizations and extensions of the Birnbaum-Saunders distribution are presented in
Díaz-García and Leiva (2005) and Gomes et al. (2009).



Rieck and Nedelman (1991) proposed a log-linear regression model based on
the Birnbaum-Saunders distribution. They showed that if $T \sim \text{B-S}(\alpha, \eta)$, then
$Y = \log(T)$ is sinh-normal distributed, say $Y \sim \mathcal{SN}(\alpha, \mu, \sigma)$, with shape,
location and scale parameters given by $\alpha$, $\mu = \log(\eta)$ and $\sigma = 2$, respectively.
Diagnostic tools for the Birnbaum-Saunders regression model were developed by
Galea et al. (2004), Leiva et al. (2007) and Xie and Wei (2007). Small-sample adjustments
for the likelihood ratio test can be found in Lemonte et al. (2009).



Recently, Lemonte and Cordeiro (2009) proposed a new class of Birnbaum-Saunders
nonlinear regression models. The class generalizes the regression model described by
Rieck and Nedelman (1991). Additionally, the authors discussed maximum likelihood
estimation for the parameters of the model, and derived closed-form expressions
for the second-order biases of these estimates.

Diagnostic analysis is an efficient way to detect influential observations. The
first technique developed to assess the individual impact of cases on the estimation
process is, perhaps, case deletion, which became a very popular tool. However,
case deletion excludes all information from an observation and we can hardly
say whether that observation has some influence on a specific aspect of the model.
To overcome this problem, one can resort to the local influence approach, where one
investigates the model sensitivity under small perturbations. In this context,
Cook (1986) proposed a general framework to detect influential observations
which gives a measure of this sensitivity under small perturbations on the data or
in the model. Several authors have extended the local influence method to various
regression models; see, for example, Lawrance (1988), Thomas and Cook (1990),
Paula (1993), Lesaffre and Verbeke (1998) and, more recently, Osorio et al. (2007),
Espinheira et al. (2008) and Paula et al. (2009), among others.

In this article, we present diagnostic methods based on local influence and
generalized leverage in the class of Birnbaum-Saunders nonlinear regression models.
Our results generalize those in Galea et al. (2004), which are confined to
Birnbaum-Saunders linear regression models. In Section 2 we present the class of
Birnbaum-Saunders nonlinear regression models. The score functions and observed Fisher
information matrix are given as well as the process for estimating the regression
coefficients and the shape parameter. Derivations of the normal curvature under
different perturbation schemes together with generalized leverage are made in
Section 3. Finally, Section 4 concludes the paper.



2 Birnbaum-Saunders nonlinear regression model

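Let $T \sim \text{B-S}(\alpha, \eta)$. The density function of $Y = \log(T)$ has the form

$$\pi(y; \alpha, \mu, \sigma) = \frac{2}{\alpha\sigma\sqrt{2\pi}} \cosh\left(\frac{y-\mu}{\sigma}\right) \exp\left\{-\frac{2}{\alpha^2}\sinh^2\left(\frac{y-\mu}{\sigma}\right)\right\}, \quad y \in \mathbb{R}.$$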

This distribution has a number of interesting properties: (i) It is symmetric around
the location parameter $\mu$; (ii) It is unimodal for $\alpha \leq 2$ and bimodal for $\alpha > 2$;
(iii) $E(Y) = \mu$ and its variance is a function of $\alpha$ only, and has no closed-form
expression, but Rieck (1989) obtained asymptotic approximations for both small
and large values of $\alpha$; (iv) If $Y_\alpha \sim \mathcal{SN}(\alpha, \mu, \sigma)$, then
$Z_\alpha = 2(Y_\alpha - \mu)/(\alpha\sigma)$ converges in distribution to the standard normal
distribution when $\alpha \to 0$.
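The properties above can be checked numerically. The following is a minimal sketch, assuming the sinh-normal density written with shape `alpha`, location `mu` and scale `sigma`; the function names `sinh_normal_pdf` and `trapezoid` are illustrative and not part of the original paper.

```python
import math

def sinh_normal_pdf(y, alpha, mu=0.0, sigma=2.0):
    """Sinh-normal density with shape alpha, location mu and scale sigma."""
    z = (y - mu) / sigma
    const = 2.0 / (alpha * sigma * math.sqrt(2.0 * math.pi))
    return const * math.cosh(z) * math.exp(-(2.0 / alpha**2) * math.sinh(z) ** 2)

def trapezoid(f, lo, hi, steps=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    inner = sum(f(lo + k * h) for k in range(1, steps))
    return h * (0.5 * (f(lo) + f(hi)) + inner)

# Total mass over a range wide enough that the tails are negligible.
mass = trapezoid(lambda y: sinh_normal_pdf(y, alpha=0.5, mu=1.0), -39.0, 41.0)

# Property (ii): compare the density at mu = 0 with the density slightly away from it.
change_alpha_1 = sinh_normal_pdf(0.5, alpha=1.0) - sinh_normal_pdf(0.0, alpha=1.0)
change_alpha_3 = sinh_normal_pdf(0.5, alpha=3.0) - sinh_normal_pdf(0.0, alpha=3.0)
print(mass, change_alpha_1 < 0, change_alpha_3 > 0)
```

For a shape parameter above 2 the density rises when moving away from the location parameter, which is the bimodal case in property (ii).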

Lemonte and Cordeiro (2009) proposed the following regression model:

$$y_i = f_i(x_i; \beta) + \varepsilon_i, \quad i = 1, \ldots, n, \qquad (1)$$

where $y_i$ is the logarithm of the $i$th observed lifetime, $x_i = (x_{i1}, x_{i2}, \ldots, x_{im})^\top$ is
an $m \times 1$ vector of known explanatory variables associated with the $i$th observable
response $y_i$, $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^\top$ is a vector of unknown nonlinear parameters, and
$\varepsilon_i \sim \mathcal{SN}(\alpha, 0, 2)$. We assume a nonlinear structure for the location parameter $\mu_i$
in model (1), say $\mu_i = f_i(x_i; \beta)$, where $f_i$ is assumed to be a known and twice
continuously differentiable function with respect to $\beta$.

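The log-likelihood function for the vector parameter $\theta = (\beta^\top, \alpha)^\top$ from a random
sample $y = (y_1, y_2, \ldots, y_n)^\top$ obtained from (1) can be expressed as

$$\ell(\theta) = \sum_{i=1}^{n} \ell_i(\theta), \qquad (2)$$

where $\ell_i(\theta) = -\log(8\pi)/2 + \log(\xi_{i1}) - \xi_{i2}^2/2$,

$$\xi_{i1} = \xi_{i1}(\theta) = \frac{2}{\alpha}\cosh\left(\frac{y_i - \mu_i}{2}\right), \quad \xi_{i2} = \xi_{i2}(\theta) = \frac{2}{\alpha}\sinh\left(\frac{y_i - \mu_i}{2}\right), \qquad (3)$$

for $i = 1, 2, \ldots, n$. The $n \times p$ local matrix $D = D(\beta) = \partial\mu/\partial\beta$ of partial
derivatives of $\mu = (\mu_1, \mu_2, \ldots, \mu_n)^\top$ with respect to $\beta$ is assumed to be of full rank, i.e.,
$\text{rank}(D) = p$ for all $\beta$.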
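The log-likelihood in (2) and (3) can be evaluated directly. The following is a minimal sketch; the mean function `mean_function` (an exponential decay in one covariate) and the data values are hypothetical and chosen only for illustration.

```python
import math

def xi(y_i, mu_i, alpha):
    """Return (xi_i1, xi_i2) as defined in (3)."""
    half = (y_i - mu_i) / 2.0
    return (2.0 / alpha) * math.cosh(half), (2.0 / alpha) * math.sinh(half)

def log_likelihood(y, mu, alpha):
    """Log-likelihood (2): sum of -log(8*pi)/2 + log(xi_i1) - xi_i2**2 / 2."""
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        xi1, xi2 = xi(y_i, mu_i, alpha)
        total += -0.5 * math.log(8.0 * math.pi) + math.log(xi1) - 0.5 * xi2**2
    return total

def mean_function(x, beta):
    # Hypothetical nonlinear mean mu_i = b1 * exp(-b2 * x_i), for illustration only.
    b1, b2 = beta
    return [b1 * math.exp(-b2 * x_i) for x_i in x]

x_obs = [0.0, 0.5, 1.0, 1.5]          # made-up covariate values
y_obs = [2.1, 1.4, 1.3, 0.8]          # made-up log-lifetimes
print(log_likelihood(y_obs, mean_function(x_obs, (2.0, 0.5)), alpha=0.7))
```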
The score functions for $\beta$ and $\alpha$ can be expressed, respectively, as

$$U_\beta = \frac{1}{2} D^\top s \quad \text{and} \quad U_\alpha = -\frac{n}{\alpha} + \frac{1}{\alpha}\sum_{i=1}^{n} \xi_{i2}^2,$$

where $s = s(\theta)$ is an $n$-vector whose $i$th element is equal to $\xi_{i1}\xi_{i2} - \xi_{i2}/\xi_{i1}$. The
MLE $\hat\theta = (\hat\beta^\top, \hat\alpha)^\top$ satisfies the $p + 1$ equations $U_\beta = 0$ and $U_\alpha = 0$. A joint iterative
procedure to obtain the MLEs of $\beta$ and $\alpha$ is given by (Lemonte and Cordeiro, 2009)

$$\beta^{(m+1)} = (D^{(m)\top} D^{(m)})^{-1} D^{(m)\top} \zeta^{(m)}, \quad \alpha^{(m+1)} = \frac{\alpha^{(m)}}{2}\left(1 + \bar\xi_2^{(m)}\right), \quad m = 0, 1, \ldots,$$

where $\zeta^{(m)} = D^{(m)}\beta^{(m)} + \{2/\psi(\alpha^{(m)})\}s^{(m)}$, $\bar\xi_2^{(m)} = \sum_{i=1}^{n} \xi_{i2}^{2(m)}/n$ and
$\psi(\alpha) = 2 + 4/\alpha^2 - \alpha^{-1}\sqrt{2\pi}\{1 - \text{erf}(\sqrt{2}/\alpha)\}\exp(2/\alpha^2)$. Also, $\text{erf}(\cdot)$ is the error
function (see, for example, Gradshteyn and Ryzhik, 2007). It can be shown that $\psi(\alpha) \approx 1 + 4/\alpha^2$
for $\alpha$ small and $\psi(\alpha) \approx 2$ for $\alpha$ large. The above equations show that any software
with a weighted linear regression routine can be used to calculate the MLEs of $\beta$ and
$\alpha$ iteratively. Starting values $\beta^{(0)}$ and $\alpha^{(0)}$ for the iterative algorithm are required.
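The iterative procedure can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the mean function $\mu_i = \beta_1 \exp(-\beta_2 x_i)$, the simulated data, the starting values and the stopping rule are all hypothetical, and $1 - \text{erf}(x)$ is computed as `math.erfc(x)`.

```python
import math
import numpy as np

def psi(alpha):
    # psi(alpha) from the text, with 1 - erf(x) written as erfc(x).
    # Note: exp(2 / alpha**2) overflows for alpha below roughly 0.054.
    x = math.sqrt(2.0) / alpha
    return 2.0 + 4.0 / alpha**2 - math.sqrt(2.0 * math.pi) / alpha * math.erfc(x) * math.exp(x * x)

def mean_and_jacobian(beta, x):
    # Hypothetical mean function mu_i = b1 * exp(-b2 * x_i) and its n x p Jacobian D.
    e = np.exp(-beta[1] * x)
    return beta[0] * e, np.column_stack([e, -beta[0] * x * e])

def xi_terms(resid, alpha):
    xi1 = 2.0 / alpha * np.cosh(resid / 2.0)
    xi2 = 2.0 / alpha * np.sinh(resid / 2.0)
    return xi1, xi2, xi1 * xi2 - xi2 / xi1      # last entry is the vector s

def score(beta, alpha, y, x):
    mu, D = mean_and_jacobian(beta, x)
    _, xi2, s = xi_terms(y - mu, alpha)
    return 0.5 * D.T @ s, -len(y) / alpha + np.sum(xi2**2) / alpha

def fit(y, x, beta0, alpha0, max_iter=500, tol=1e-12):
    beta, alpha = np.asarray(beta0, dtype=float), float(alpha0)
    for _ in range(max_iter):
        mu, D = mean_and_jacobian(beta, x)
        _, xi2, s = xi_terms(y - mu, alpha)
        zeta = D @ beta + (2.0 / psi(alpha)) * s             # working response
        beta_new = np.linalg.lstsq(D, zeta, rcond=None)[0]   # (D^T D)^{-1} D^T zeta
        alpha_new = 0.5 * alpha * (1.0 + np.mean(xi2**2))
        step = max(np.max(np.abs(beta_new - beta)), abs(alpha_new - alpha))
        beta, alpha = beta_new, alpha_new
        if step < tol:
            break
    return beta, alpha

# Simulated data: if Z ~ N(0, 1), then 2 * arcsinh(alpha * Z / 2) is SN(alpha, 0, 2).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 200)
mu_true, _ = mean_and_jacobian(np.array([2.0, 0.5]), x)
y = mu_true + 2.0 * np.arcsinh(0.5 * rng.standard_normal(200) / 2.0)
beta_hat, alpha_hat = fit(y, x, beta0=[2.0, 0.5], alpha0=1.0)
print(beta_hat, alpha_hat, score(beta_hat, alpha_hat, y, x))
```

The update for $\beta$ is an ordinary least-squares fit of the working response $\zeta$ on $D$, which is why a linear regression routine suffices at each step.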

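The asymptotic inference for the parameter vector $\theta = (\beta^\top, \alpha)^\top$ can be based on
the normal approximation of the MLE of $\theta$, $\hat\theta = (\hat\beta^\top, \hat\alpha)^\top$. Let $\Sigma_\theta$ be the asymptotic
variance-covariance matrix for $\hat\theta$. Then, for $n$ large, $\hat\theta \overset{a}{\sim} \mathcal{N}_{p+1}(\theta, \Sigma_\theta)$, where
$\overset{a}{\sim}$ denotes approximately distributed. Additionally, $\Sigma_\theta$ may be approximated by
$-\ddot{L}_{\theta\theta}^{-1}$, where $-\ddot{L}_{\theta\theta}$ is the $(p+1) \times (p+1)$ observed information matrix evaluated
at $\hat\theta$, obtained from

$$\ddot{L}_{\theta\theta} = \begin{bmatrix} \ddot{L}_{\beta\beta} & \ddot{L}_{\beta\alpha} \\ \ddot{L}_{\alpha\beta} & \ddot{L}_{\alpha\alpha} \end{bmatrix} = \begin{bmatrix} D^\top V D + \frac{1}{2}[s^\top][G] & D^\top h \\ h^\top D & \text{tr}(K) \end{bmatrix},$$

where $V = \text{diag}\{v_1, v_2, \ldots, v_n\}$, $v_i = v_i(\theta) = -\{2\xi_{i2}^2 + 4/\alpha^2 - 1 + \xi_{i2}^2/\xi_{i1}^2\}/4$,
$h = (h_1, h_2, \ldots, h_n)^\top$, $h_i = h_i(\theta) = -\xi_{i1}\xi_{i2}/\alpha$, $K = \text{diag}\{k_1, k_2, \ldots, k_n\}$,
$k_i = k_i(\theta) = 1/\alpha^2 - 3\xi_{i2}^2/\alpha^2$ and $G = G(\beta) = \partial^2\mu/\partial\beta\partial\beta^\top$ is an array of dimension
$n \times p \times p$. Finally, $[\cdot][\cdot]$ represents the bracket product of a matrix by an array as
defined by Wei (1998, p. 188) (see footnote 1).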
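The blocks of the second-derivative matrix can be assembled as in the sketch below. The bracket product $[s^\top][G]$ is computed as $\sum_i s_i G_i$, where $G_i$ is the $p \times p$ slice of $G$ for observation $i$. The function names and the exponential mean function are hypothetical and serve only as an example.

```python
import numpy as np

def second_derivative_matrix(D, G, resid, alpha):
    """(p+1) x (p+1) matrix of second derivatives of the log-likelihood.

    D: n x p Jacobian of mu; G: n x p x p array of second derivatives of mu;
    resid: y - mu; alpha: shape parameter.
    """
    xi1 = 2.0 / alpha * np.cosh(resid / 2.0)
    xi2 = 2.0 / alpha * np.sinh(resid / 2.0)
    s = xi1 * xi2 - xi2 / xi1
    v = -(2.0 * xi2**2 + 4.0 / alpha**2 - 1.0 + xi2**2 / xi1**2) / 4.0
    h = -xi1 * xi2 / alpha
    k = 1.0 / alpha**2 - 3.0 * xi2**2 / alpha**2
    p = D.shape[1]
    L = np.empty((p + 1, p + 1))
    # Bracket product [s^T][G] = sum_i s_i * G[i], a p x p matrix.
    L[:p, :p] = D.T @ (v[:, None] * D) + 0.5 * np.einsum("i,ijk->jk", s, G)
    L[:p, p] = L[p, :p] = D.T @ h
    L[p, p] = np.sum(k)
    return L

def model_derivatives(beta, x):
    # Hypothetical mean function mu_i = b1 * exp(-b2 * x_i), for illustration only.
    e = np.exp(-beta[1] * x)
    D = np.column_stack([e, -beta[0] * x * e])
    G = np.zeros((len(x), 2, 2))
    G[:, 0, 1] = G[:, 1, 0] = -x * e
    G[:, 1, 1] = beta[0] * x**2 * e
    return beta[0] * e, D, G
```

Evaluated at the MLE, the negative inverse of the returned matrix gives the approximate variance-covariance matrix described above.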

Footnote 1. If $A$ is an $n \times p \times q$ array and $B$ is an $m \times n$ matrix, then $C = [B][A]$ is called the
bracket product of $B$ and $A$; it is an $m \times p \times q$ array with elements $C_{tij} = \sum_{k=1}^{n} B_{tk}A_{kij}$.

3 Diagnostic analysis
3.1 Local influence

The local influence method is recommended when the concern is to investigate
the model sensitivity under some minor perturbations in the model (or data).
Let $\omega$ be a $k$-dimensional vector of perturbations; the perturbed log-likelihood
function is denoted by $\ell(\theta|\omega)$. We assume that there exists a no-perturbation vector, namely
$\omega_0$, such that $\ell(\theta|\omega_0) = \ell(\theta)$. The influence of minor perturbations on the
maximum likelihood estimate $\hat\theta$ can be assessed by using the log-likelihood displacement
$LD_\omega = 2\{\ell(\hat\theta) - \ell(\hat\theta_\omega)\}$, where $\hat\theta_\omega$ denotes the maximum likelihood estimate under
$\ell(\theta|\omega)$.

Cook's idea for assessing local influence is essentially to analyse the local
behavior of $LD_\omega$ around $\omega_0$ by evaluating the curvature of the plot of $LD_{\omega_0 + ad}$
against $a$, where $a \in \mathbb{R}$ and $d$ is a unit norm direction. One of the measures of
particular interest is the direction $d_{\max}$ corresponding to the largest curvature $C_{d_{\max}}$.
The index plot of $d_{\max}$ may reveal those observations that have considerable
influence on $LD_\omega$ under minor perturbations. Also, plots of $d_{\max}$ against covariate
values may be helpful for identifying atypical patterns. Cook (1986) shows that the
normal curvature at the direction $d$ is given by

$$C_d(\theta) = 2\left|d^\top \Delta^\top \ddot{L}_{\theta\theta}^{-1} \Delta d\right|,$$

where $\Delta = \partial^2\ell(\theta|\omega)/\partial\theta\partial\omega^\top$, and both $\Delta$ and $\ddot{L}_{\theta\theta}$ are evaluated at $\hat\theta$ and $\omega_0$. Hence,
$C_{d_{\max}}$ is twice the largest eigenvalue of $B = -\Delta^\top \ddot{L}_{\theta\theta}^{-1}\Delta$ and $d_{\max}$ is the corresponding
unit norm eigenvector. The index plot of $d_{\max}$ for the matrix $B$ may show how to
perturb the model (or data) to obtain large changes in the estimate of $\theta$.
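The curvature computation can be sketched as follows. Because $C_d = 2|d^\top \Delta^\top \ddot{L}_{\theta\theta}^{-1} \Delta d|$, the sketch returns twice the largest eigenvalue of $B$ so that the value agrees with $C_d$ evaluated at $d_{\max}$. The function names and the small matrices in the usage lines are hypothetical.

```python
import numpy as np

def normal_curvature(Delta, L, d):
    """C_d = 2 |d^T Delta^T L^{-1} Delta d| for the direction d (normalized here)."""
    d = np.asarray(d, dtype=float)
    d = d / np.linalg.norm(d)
    return 2.0 * abs(d @ Delta.T @ np.linalg.solve(L, Delta @ d))

def max_curvature_direction(Delta, L):
    """Largest curvature and d_max from the eigen-decomposition of B = -Delta^T L^{-1} Delta."""
    B = -Delta.T @ np.linalg.solve(L, Delta)
    B = 0.5 * (B + B.T)                      # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    return 2.0 * eigvals[-1], eigvecs[:, -1]

# Toy inputs: L plays the role of a negative definite second-derivative matrix
# (p + 1 = 2) and Delta is a 2 x 4 perturbation matrix (n = 4).
L_toy = -np.array([[2.0, 0.5], [0.5, 1.0]])
Delta_toy = np.array([[1.0, 0.2, -0.3, 0.0], [0.5, -1.0, 0.4, 0.1]])
c_max, d_max = max_curvature_direction(Delta_toy, L_toy)
print(c_max, np.abs(d_max))   # an index plot would display |d_max| against i
```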

However, if the interest lies in computing the local influence for $\beta$, the normal
curvature in the direction of the vector $d$ is $C_{d;\beta}(\theta) = 2|d^\top \Delta^\top (\ddot{L}_{\theta\theta}^{-1} - \ddot{L}_{22})\Delta d|$,
where

$$\ddot{L}_{22} = \begin{bmatrix} 0 & 0 \\ 0 & \ddot{L}_{\alpha\alpha}^{-1} \end{bmatrix}$$

and $d_{\max;\beta}$ here is the unit norm eigenvector corresponding to the largest eigenvalue
of $B_1 = -\Delta^\top (\ddot{L}_{\theta\theta}^{-1} - \ddot{L}_{22})\Delta$ (see Cook, 1986, Eq. (26)). The index plot of the
eigenvector associated with the largest eigenvalue of $B_1$ may reveal those observations
that are influential on $\hat\beta$.

Another procedure is the total local curvature corresponding to the $i$th element,
which follows by taking $d_i$ as an $n \times 1$ vector of zeros with one at the $i$th position.
Thus, the curvature at the direction $d_i$ assumes the form $C_i(\theta) = 2|\Delta_i^\top \ddot{L}_{\theta\theta}^{-1}\Delta_i|$,
where $\Delta_i$ denotes the $i$th column of $\Delta$. This is named total local influence (see, for
instance, Lesaffre and Verbeke, 1998). It is also possible to compute the total local
influence of the $i$th individual when estimating a subset of the elements of $\theta$. For
instance, if the interest lies in $\beta$, we have that $C_{i;\beta}(\theta) = 2|\Delta_i^\top (\ddot{L}_{\theta\theta}^{-1} - \ddot{L}_{22})\Delta_i|$.
Verbeke and Molenberghs (2000, § 11.3) propose flagging as influential those cases
such that $C_i > 2\bar{C}$, where $\bar{C} = \sum_{i=1}^{n} C_i/n$.
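A minimal sketch of the total local influence and of the cutoff at twice the average curvature is given below. It assumes that $\Delta$ is stored as a $(p+1) \times n$ array whose $i$th column belongs to observation $i$; the function names and the toy inputs are hypothetical.

```python
import numpy as np

def total_local_influence(Delta, L, p=None):
    """C_i = 2 |Delta_i^T M Delta_i| for every column Delta_i of Delta.

    With p=None, M = L^{-1} (influence on the whole parameter vector).
    With p given, M = L^{-1} - L_22, where L_22 is zero except for the inverse
    of the trailing block L[p:, p:] (influence on the first p parameters).
    """
    M = np.linalg.inv(L)
    if p is not None:
        L22 = np.zeros_like(M)
        L22[p:, p:] = np.linalg.inv(L[p:, p:])
        M = M - L22
    return 2.0 * np.abs(np.einsum("ri,rs,si->i", Delta, M, Delta))

def flag_influential(C):
    """Indices of cases whose curvature exceeds twice the average curvature."""
    return np.flatnonzero(C > 2.0 * np.mean(C))

# Toy inputs with p = 1 regression parameter and n = 4 cases.
L_toy = -np.eye(2)
Delta_toy = np.array([[1.0, 1.0, 1.0, 5.0], [2.0, 0.0, 0.0, 0.0]])
C_all = total_local_influence(Delta_toy, L_toy)
C_beta = total_local_influence(Delta_toy, L_toy, p=1)
print(C_all, C_beta, flag_influential(C_all))
```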
3.2 Curvature calculations

Next, we calculate, for three different perturbation schemes, the matrix

$$\Delta = \{\Delta_{ri}\}_{(p+1) \times n} = \left(\frac{\partial^2 \ell(\theta|\omega)}{\partial\theta_r \partial\omega_i}\right)\Bigg|_{\theta=\hat\theta,\,\omega=\omega_0}, \quad r = 1, 2, \ldots, p+1 \text{ and } i = 1, 2, \ldots, n,$$

considering the model defined in (1) and its log-likelihood function given by (2). In
what follows, the quantities distinguished by the addition of "^" are evaluated at
$\hat\theta = (\hat\beta^\top, \hat\alpha)^\top$.



3.2.1 Case-weights perturbation 

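The perturbation of cases is done by defining some weights for each observation in
the log-likelihood function as follows:

$$\ell(\theta|\omega) = \sum_{i=1}^{n} \omega_i \ell_i(\theta),$$

where $\omega = (\omega_1, \omega_2, \ldots, \omega_n)^\top$ is the total vector of weights, with $0 \leq \omega_i \leq 1$, for
$i = 1, 2, \ldots, n$, and $\omega_0 = (1, 1, \ldots, 1)^\top$ is the vector of no perturbations. The matrix
$\Delta$ is given by

$$\Delta = \begin{bmatrix} \Delta_\beta \\ \Delta_\alpha \end{bmatrix},$$

where $\Delta_\beta = \hat{D}^\top \text{diag}\{\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_n\}$, with $\hat{a}_i = (\hat\xi_{i1}\hat\xi_{i2} - \hat\xi_{i2}/\hat\xi_{i1})/2$, and
$\Delta_\alpha = (\hat{b}_1, \hat{b}_2, \ldots, \hat{b}_n)$, with $\hat{b}_i = -1/\hat\alpha + \hat\xi_{i2}^2/\hat\alpha$. Also, $\hat\xi_{i1} = \xi_{i1}(\hat\theta)$ and
$\hat\xi_{i2} = \xi_{i2}(\hat\theta)$, where $\xi_{i1}$ and $\xi_{i2}$ were defined in (3). Note that, for linear models,
the matrix $\Delta$ reduces to the one given in Galea et al. (2004).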
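Under case weights, column $i$ of $\Delta$ is the contribution of observation $i$ to the score vector, since differentiating $\sum_i \omega_i \ell_i(\theta)$ with respect to $\omega_i$ leaves $\ell_i(\theta)$. A minimal sketch, with the hypothetical function name `case_weight_delta` and inputs assumed to be evaluated at the MLE:

```python
import numpy as np

def case_weight_delta(D_hat, resid_hat, alpha_hat):
    """Delta = [Delta_beta; Delta_alpha], a (p+1) x n matrix, for case weights.

    D_hat: n x p Jacobian of mu; resid_hat: y - mu_hat; alpha_hat: shape estimate.
    """
    xi1 = 2.0 / alpha_hat * np.cosh(resid_hat / 2.0)
    xi2 = 2.0 / alpha_hat * np.sinh(resid_hat / 2.0)
    a = 0.5 * (xi1 * xi2 - xi2 / xi1)             # a_i
    b = -1.0 / alpha_hat + xi2**2 / alpha_hat     # b_i
    return np.vstack([D_hat.T * a, b])            # D^T diag(a) stacked on the row b
```

A quick sanity check is that the rows of the returned matrix sum to the score vector, which is zero at the MLE.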



3.2.2 Response perturbation 

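We will consider here that each $y_i$ is perturbed as $y_{i\omega} = y_i + \omega_i S_y$, where $S_y$ is a scale
factor that may be the estimated standard deviation of $y$. In this case, the perturbed
log-likelihood function is given by

$$\ell(\theta|\omega) = -\frac{n}{2}\log(8\pi) + \sum_{i=1}^{n}\log(\xi_{i1\omega_1}) - \frac{1}{2}\sum_{i=1}^{n}\xi_{i2\omega_1}^2,$$

where $\xi_{i1\omega_1} = \xi_{i1\omega_1}(\theta) = 2\alpha^{-1}\cosh([y_{i\omega} - \mu_i]/2)$, $\xi_{i2\omega_1} = \xi_{i2\omega_1}(\theta) = 2\alpha^{-1}\sinh([y_{i\omega} - \mu_i]/2)$
and $\omega_0 = (0, 0, \ldots, 0)^\top$ is the vector of no perturbations. The matrix $\Delta$
assumes the form

$$\Delta = \begin{bmatrix} \Delta_\beta \\ \Delta_\alpha \end{bmatrix},$$

where $\Delta_\beta = \hat{D}^\top \text{diag}\{\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_n\}$, with
$\hat{c}_i = S_y(2\hat\xi_{i2\omega_1}^2 + 4/\hat\alpha^2 - 1 + \hat\xi_{i2\omega_1}^2/\hat\xi_{i1\omega_1}^2)/4$,
and $\Delta_\alpha = (\hat{d}_1, \hat{d}_2, \ldots, \hat{d}_n)$, with $\hat{d}_i = S_y\hat\xi_{i1\omega_1}\hat\xi_{i2\omega_1}/\hat\alpha$. Also, $\hat\xi_{i1\omega_1} = \xi_{i1\omega_1}(\hat\theta)$ and
$\hat\xi_{i2\omega_1} = \xi_{i2\omega_1}(\hat\theta)$. It is noteworthy that the matrix $\Delta$ reduces to the one given in
Galea et al. (2004) for linear models.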
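The response perturbation matrix can be sketched as follows, with all quantities evaluated at $\omega_0$, so that the perturbed and unperturbed $\xi$ terms coincide. The expressions for $c_i$ and $d_i$ are obtained by differentiating the score with respect to $\omega_i$; the function name `response_delta` is hypothetical.

```python
import numpy as np

def response_delta(D_hat, resid_hat, alpha_hat, s_y):
    """Delta = [Delta_beta; Delta_alpha], a (p+1) x n matrix, for y_i + omega_i * s_y.

    D_hat: n x p Jacobian of mu; resid_hat: y - mu_hat; s_y: scale factor.
    """
    xi1 = 2.0 / alpha_hat * np.cosh(resid_hat / 2.0)
    xi2 = 2.0 / alpha_hat * np.sinh(resid_hat / 2.0)
    c = s_y * (2.0 * xi2**2 + 4.0 / alpha_hat**2 - 1.0 + xi2**2 / xi1**2) / 4.0   # c_i
    d = s_y * xi1 * xi2 / alpha_hat                                                # d_i
    return np.vstack([D_hat.T * c, d])
```

In practice `s_y` could be set to the sample standard deviation of the responses, for example `np.std(y, ddof=1)`.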



3.2.3 Explanatory variable perturbation 

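Consider now an additive perturbation on a particular continuous explanatory
variable, namely $x_j$, by making $x_{ij\omega} = x_{ij} + \omega_i S_x$, where $S_x$ is a scale factor that may be
the estimated standard deviation of $x_j$. This perturbation scheme leads to the following
expression for the log-likelihood function:

$$\ell(\theta|\omega) = -\frac{n}{2}\log(8\pi) + \sum_{i=1}^{n}\log(\xi_{i1\omega_2}) - \frac{1}{2}\sum_{i=1}^{n}\xi_{i2\omega_2}^2,$$

where $\xi_{i1\omega_2} = \xi_{i1\omega_2}(\theta) = 2\alpha^{-1}\cosh([y_i - \mu_{i\omega}]/2)$, $\xi_{i2\omega_2} = \xi_{i2\omega_2}(\theta) = 2\alpha^{-1}\sinh([y_i - \mu_{i\omega}]/2)$
and $\mu_{i\omega} = f_i(x_{i\omega}; \beta)$, with $x_{i\omega} = (x_{i1}, \ldots, x_{ij\omega}, \ldots, x_{im})^\top$. Here, $\omega_0 = (0, 0, \ldots, 0)^\top$
is the vector of no perturbations. The matrix $\Delta$ is given by

$$\Delta = \begin{bmatrix} \Delta_\beta \\ \Delta_\alpha \end{bmatrix},$$

where $\Delta_\beta$ is a $p \times n$ matrix with elements $\Delta_{ri}$ that assume the form (for $r = 1, 2, \ldots, p$
and $i = 1, 2, \ldots, n$)

$$\Delta_{ri} = \frac{\hat\xi_{i2\omega_2}}{2}\left(\hat\xi_{i1\omega_2} - \frac{1}{\hat\xi_{i1\omega_2}}\right)\ddot\mu_{ir\omega} - \frac{\dot\mu_{i\omega}\dot\mu_{ir\omega}}{4}\left\{2\hat\xi_{i2\omega_2}^2 + \frac{4}{\hat\alpha^2} - 1 + \frac{\hat\xi_{i2\omega_2}^2}{\hat\xi_{i1\omega_2}^2}\right\},$$

where $\hat\xi_{i1\omega_2} = \xi_{i1\omega_2}(\hat\theta)$, $\hat\xi_{i2\omega_2} = \xi_{i2\omega_2}(\hat\theta)$ and

$$\ddot\mu_{ir\omega} = \frac{\partial^2\mu_{i\omega}}{\partial\beta_r\partial\omega_i}\Bigg|_{\theta=\hat\theta,\,\omega=\omega_0}, \quad \dot\mu_{ir\omega} = \frac{\partial\mu_{i\omega}}{\partial\beta_r}\Bigg|_{\theta=\hat\theta,\,\omega=\omega_0} \quad \text{and} \quad \dot\mu_{i\omega} = \frac{\partial\mu_{i\omega}}{\partial\omega_i}\Bigg|_{\theta=\hat\theta,\,\omega=\omega_0}.$$

Additionally, $\Delta_\alpha = (\hat{e}_1, \hat{e}_2, \ldots, \hat{e}_n)$ with $\hat{e}_i = -\dot\mu_{i\omega}\hat\xi_{i1\omega_2}\hat\xi_{i2\omega_2}/\hat\alpha$.

For linear models, i.e. $\mu_i = x_i^\top\beta$, the matrix $\Delta_\beta$ reduces to the one given in
Galea et al. (2004). Note that $\mu_{i\omega} = x_i^\top\beta + \beta_j\omega_i S_x$. Thus, $\ddot\mu_{ir\omega} = 0$ $(r \neq j)$ and
$\ddot\mu_{ir\omega} = S_x$ $(r = j)$, $\dot\mu_{ir\omega} = x_{ir}$ and $\dot\mu_{i\omega} = S_x\hat\beta_j$. Clearly, $\Delta_\alpha$ also reduces to the one
given in Galea et al. (2004) for linear models.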
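The following sketch assembles $\Delta$ for the explanatory variable perturbation from the three derivative arrays of $\mu_{i\omega}$. The expression for $\Delta_{ri}$ used here is the one obtained by differentiating the score with respect to $\omega_i$; the function name `covariate_delta` and the argument layout are hypothetical.

```python
import numpy as np

def covariate_delta(D_hat, mu_dot_w, mu_ddot, resid_hat, alpha_hat):
    """Delta = [Delta_beta; Delta_alpha], a (p+1) x n matrix.

    D_hat:    n x p array with entries d mu_i / d beta_r at omega_0
    mu_dot_w: length-n array with entries d mu_i / d omega_i at omega_0
    mu_ddot:  n x p array with entries d^2 mu_i / (d beta_r d omega_i) at omega_0
    """
    xi1 = 2.0 / alpha_hat * np.cosh(resid_hat / 2.0)
    xi2 = 2.0 / alpha_hat * np.sinh(resid_hat / 2.0)
    half_s = 0.5 * xi2 * (xi1 - 1.0 / xi1)
    braces = 2.0 * xi2**2 + 4.0 / alpha_hat**2 - 1.0 + xi2**2 / xi1**2
    delta_beta = (mu_ddot * half_s[:, None] - D_hat * (mu_dot_w * braces / 4.0)[:, None]).T
    delta_alpha = -mu_dot_w * xi1 * xi2 / alpha_hat
    return np.vstack([delta_beta, delta_alpha])

def linear_model_inputs(X, beta_hat, j, s_x):
    """Derivative arrays for the linear predictor mu_i = x_i^T beta, perturbing column j."""
    n, p = X.shape
    mu_ddot = np.zeros((n, p))
    mu_ddot[:, j] = s_x
    return X, np.full(n, s_x * beta_hat[j]), mu_ddot
```

For a linear predictor the helper `linear_model_inputs` supplies the special case described in the text.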



3.3 Generalized leverage 



In what follows we shall use the generalized leverage proposed by Wei et al. (1998),
which is defined as $GL(\tilde\theta) = \partial\tilde{y}/\partial y^\top$, where $\theta$ is an $s$-vector such that $E(y) = \mu(\theta)$
and $\tilde\theta$ is an estimator of $\theta$, with $\tilde{y} = \mu(\tilde\theta)$. Here, the $(i, l)$ element of $GL(\tilde\theta)$, i.e. the
generalized leverage of the estimator $\tilde\theta$ at $(i, l)$, is the instantaneous rate of change in
the $i$th predicted value with respect to the $l$th response value. As noted by the authors,
the generalized leverage is invariant under reparameterization and observations with
large $GL_{ij}$ are leverage points. Wei et al. (1998) have shown that the generalized
leverage is obtained by evaluating

$$GL(\theta) = D_\theta(-\ddot{L}_{\theta\theta})^{-1}\ddot{L}_{\theta y}$$

at $\theta = \hat\theta$, where $D_\theta = \partial\mu/\partial\theta^\top$ and $\ddot{L}_{\theta y} = \partial^2\ell(\theta)/\partial\theta\partial y^\top$.
Under the model defined in (1), we have that

$$D_\theta = [D \;\; 0] \quad \text{and} \quad \ddot{L}_{\theta y} = -\begin{bmatrix} D^\top \text{diag}\{v_1, v_2, \ldots, v_n\} \\ h^\top \end{bmatrix},$$

where $v_i$ $(i = 1, 2, \ldots, n)$ and $h$ are as defined in Section 2; the minus sign arises
because $\ell_i(\theta)$ depends on $y_i$ only through $y_i - \mu_i$. It is noteworthy
that $GL(\theta)$ reduces to the one given in Galea et al. (2004) for linear models.
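A minimal sketch of the generalized leverage computation follows. It takes $\ddot{L}_{\theta y} = -[D^\top V;\, h^\top]$, which is what differentiating the score with respect to $y$ gives because each $\ell_i$ depends on $y_i$ only through $y_i - \mu_i$. The function name `generalized_leverage` and the argument layout are hypothetical; `L_hat` is the $(p+1) \times (p+1)$ matrix of second derivatives of the log-likelihood at the MLE.

```python
import numpy as np

def generalized_leverage(D_hat, L_hat, resid_hat, alpha_hat):
    """GL = D_theta (-L_theta_theta)^{-1} L_theta_y, an n x n matrix.

    D_hat: n x p Jacobian of mu at the MLE; L_hat: second-derivative matrix at
    the MLE; resid_hat: y - mu_hat; alpha_hat: estimated shape parameter.
    """
    n = D_hat.shape[0]
    xi1 = 2.0 / alpha_hat * np.cosh(resid_hat / 2.0)
    xi2 = 2.0 / alpha_hat * np.sinh(resid_hat / 2.0)
    v = -(2.0 * xi2**2 + 4.0 / alpha_hat**2 - 1.0 + xi2**2 / xi1**2) / 4.0
    h = -xi1 * xi2 / alpha_hat
    D_theta = np.hstack([D_hat, np.zeros((n, 1))])   # mu does not depend on alpha
    L_theta_y = -np.vstack([D_hat.T * v, h])         # d^2 l / (d theta d y^T)
    return D_theta @ np.linalg.solve(-L_hat, L_theta_y)
```

The diagonal of the returned matrix can be plotted against the case index to look for leverage points.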



4 Concluding remarks 

The Birnbaum-Saunders distribution is widely used to model times to failure for
materials subject to fatigue. In this paper, we developed influence diagnostics for
the class of Birnbaum-Saunders nonlinear regression models, which can be useful for
modeling lifetime or reliability data. Appropriate matrices for assessing local
influence on the parameter estimates under different perturbation schemes are obtained.
Our results are very general and can be applied to any nonlinear regression model
defined by (1). In particular, our results generalize those in Galea et al. (2004), which
are confined to Birnbaum-Saunders linear regression models.